Spatialized teleconferencing: recording and 'Squeezed' rendering of multiple distributed sites
Teleconferencing systems are becoming increasingly realistic and pleasant for users interacting with geographically distant meeting participants. Video screens display a complete view of the remote participants, using technology such as wraparound or multiple video screens. However, the corresponding audio does not offer the same sophistication: often only a mono or stereo track is presented. This paper proposes a teleconferencing audio recording and playback paradigm that captures the spatial locations of the geographically distributed participants for rendering of the remote soundfields at the users' end. Utilizing standard 5.1 surround sound playback, this paper proposes a surround rendering approach that 'squeezes' the multiple recorded soundfields from remote teleconferencing sites to help the user disambiguate multiple speakers from different participating sites.
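One plausible reading of the 'squeeze' operation is that each remote site's soundfield is compressed into its own angular sector of the listener's surround stage, so sources from different sites never overlap in direction. The sketch below is a hypothetical illustration of such a mapping; the function name `squeeze_azimuth` and the uniform sector split are assumptions, not the paper's actual method.

```python
def squeeze_azimuth(azimuth_deg, site_index, n_sites, stage_span_deg=360.0):
    """Map a source azimuth recorded at one remote site into that site's
    angular sector of the shared surround stage.

    Hypothetical sketch: each of n_sites receives an equal sector, and
    azimuths within a site are linearly compressed into that sector.
    """
    sector = stage_span_deg / n_sites
    sector_start = site_index * sector
    # Normalise the source azimuth to [0, 360), then compress into the sector.
    return sector_start + (azimuth_deg % 360.0) / 360.0 * sector

# Two sites sharing a 360-degree stage: site 0 occupies 0-180, site 1 occupies 180-360.
print(squeeze_azimuth(90.0, 0, 2))  # a site-0 source at 90 degrees maps to 45.0
print(squeeze_azimuth(90.0, 1, 2))  # the same azimuth at site 1 maps to 225.0
```

Because the sectors are disjoint, a listener can attribute a speaker to a site from direction alone, which is the disambiguation goal the abstract describes.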
Varying microphone patterns for meeting speech segmentation using spatial audio cues
Meetings, common to many business environments, generally involve stationary participants. Thus, participant location information can be used to segment meeting speech recordings into each speaker's 'turn'. The authors' previous work proposed the use of spatial audio cues to represent the speaker locations. This paper studies the validity of using spatial audio cues for meeting speech segmentation by investigating the effect of varying the microphone pattern on the spatial cues. Experiments conducted on recordings of a real acoustic environment indicate that the relationship between speaker location and spatial audio cues strongly depends on the microphone pattern.
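The microphone patterns being varied can be described by the standard first-order polar equation, in which a single parameter sweeps from omnidirectional through cardioid to figure-of-eight. The sketch below shows that generic textbook formula, not the paper's specific recording setup.

```python
import math

def mic_gain(theta_rad, alpha):
    """First-order directional microphone gain at angle theta.

    alpha = 1.0 -> omnidirectional, 0.5 -> cardioid, 0.0 -> figure-of-eight.
    """
    return alpha + (1.0 - alpha) * math.cos(theta_rad)

# A cardioid rejects sound arriving from the rear (theta = pi); an omni does not.
print(round(mic_gain(math.pi, 0.5), 6))  # 0.0
print(round(mic_gain(math.pi, 1.0), 6))  # 1.0
```

Because the gain applied to a source depends jointly on its angle and on `alpha`, changing the pattern changes the level cues picked up for the same speaker position, which is consistent with the dependence the experiments report.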
Using spatial audio cues from speech excitation for meeting speech segmentation
Multiparty meetings generally involve stationary participants. Participant location information can thus be used to segment the recorded meeting speech into each speaker's 'turn' for meeting 'browsing'. To represent speaker location information from speech, previous research showed that the most reliable time delay estimates are extracted from the Hilbert envelope of the linear prediction residual signal. The authors' past work has proposed the use of spatial audio cues to represent speaker location information. This paper proposes extracting spatial audio cues from the Hilbert envelope of the speech residual to indicate changing speaker location for meeting speech segmentation. Experiments conducted on recordings of a real acoustic environment show that spatial cues from the Hilbert envelope are more consistent across frequency subbands and can more clearly distinguish between spatially distributed speakers than spatial cues estimated from the recorded speech or residual signal.
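The two signal-processing steps named above (linear prediction residual, then Hilbert envelope) can be sketched with numpy alone. This is a minimal illustration assuming a standard autocorrelation-method LP analysis and an FFT-based analytic signal; the prediction order and test signal are arbitrary choices, not the paper's configuration.

```python
import numpy as np

def lp_residual(x, order=10):
    """LP residual via the autocorrelation method (order is illustrative)."""
    r = np.correlate(x, x, mode="full")[len(x) - 1 : len(x) + order]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, r[1 : order + 1])  # LP coefficients (Yule-Walker)
    pred = np.zeros_like(x)
    for k in range(order):                    # one-step linear prediction
        pred[k + 1 :] += a[k] * x[: len(x) - k - 1]
    return x - pred                           # prediction error (residual)

def hilbert_envelope(x):
    """Magnitude of the analytic signal, computed with an FFT (no SciPy needed)."""
    n = len(x)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1 : n // 2] = 2.0
    else:
        h[1 : (n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(np.fft.fft(x) * h))

# The envelope of a pure cosine (a whole number of cycles) is flat at its amplitude.
t = np.arange(256)
env = hilbert_envelope(np.cos(2 * np.pi * 8 * t / 256))
print(np.allclose(env, 1.0))  # True
```

Time delay estimates would then be computed by cross-correlating the envelopes of the residuals from different microphone channels, rather than the raw speech.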
Time delay estimation of reverberant meeting speech: on the use of multichannel linear prediction
Effective and efficient access to multiparty meeting recordings requires techniques for meeting analysis and indexing. Since meeting participants are generally stationary, speaker location information may be used to identify meeting events, e.g., to detect speaker changes. Time-delay estimation (TDE) utilizing cross-correlation of multichannel speech recordings is a common approach for deriving speech source location information. Previous research improved TDE by computing it from linear prediction (LP) residual signals obtained by LP analysis of each individual speech channel. This paper investigates the use of LP residuals for speech TDE, where the residuals are obtained by jointly modeling the multiple speech channels. Experiments conducted with a simulated reverberant room and real room recordings show that jointly modeled LP better predicts the LP coefficients, compared to LP applied to individual channels. Both the individually and jointly modeled LP exhibit similar TDE performance, and both outperform TDE on the speech alone, especially with the real recordings.
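The cross-correlation approach to TDE mentioned above can be sketched in a few lines. This is a generic illustration on a synthetic delayed signal using plain cross-correlation; it is not the paper's multichannel-LP method, and the sampling rate and delay are invented for the example.

```python
import numpy as np

def tde_cross_correlation(x, y, fs):
    """Estimate the delay of y relative to x via full cross-correlation.

    Returns (lag_samples, lag_seconds); a positive lag means y lags x.
    """
    corr = np.correlate(y, x, mode="full")
    lag = int(np.argmax(corr)) - (len(x) - 1)
    return lag, lag / fs

# Synthetic example: the same noise burst arriving 40 samples later
# at a second microphone.
rng = np.random.default_rng(0)
fs = 16000
x = rng.standard_normal(1024)
delay = 40
y = np.concatenate([np.zeros(delay), x])[: len(x)]

lag, seconds = tde_cross_correlation(x, y, fs)
print(lag)  # 40
```

In a meeting scenario the sign and magnitude of this lag between microphone pairs act as the spatial cue: a stationary speaker yields a stable lag, and a jump in the lag suggests a speaker change.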
Smoking affects gene expression in blood of patients with ischemic stroke.
Objective: Though cigarette smoking (CS) is a well-known risk factor for ischemic stroke (IS), there are no data on how CS affects the blood transcriptome in IS patients. Methods: We recruited IS-current smokers (IS-SM), IS-never smokers (IS-NSM), control smokers (C-SM), and control never smokers (C-NSM). mRNA expression was assessed on HTA-2.0 microarrays, and genes expressed uniquely as well as in common were identified for IS-SM versus IS-NSM and for C-SM versus C-NSM. Results: One hundred and fifty-eight genes were differentially expressed in IS-SM versus IS-NSM; 100 genes were differentially expressed in C-SM versus C-NSM; and 10 genes were common to both IS-SM and C-SM (P < 0.01; |fold change| ≥ 1.2). Functional pathway analysis showed the 158 IS-SM-regulated genes were associated with T-cell receptor, cytokine-cytokine receptor, chemokine, adipocytokine, tight junction, Jak-STAT, ubiquitin-mediated proteolysis, and adherens junction signaling. IS-SM showed more altered genes and functional networks than C-SM. Interpretation: We propose that some of the 10 genes elevated in both IS-SM and C-SM (GRP15, LRRN3, CLDND1, ICOS, GCNT4, VPS13A, DAP3, SNORA54, HIST1H1D, and SCARNA6) might contribute to the increased risk of stroke in current smokers, and that some genes expressed by blood leukocytes and platelets after stroke in smokers might contribute to the worse stroke outcomes that occur in smokers.
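The significance criterion quoted above (P < 0.01 and |fold change| ≥ 1.2) amounts to a simple filter over per-gene statistics. The sketch below illustrates that filter with invented gene names and values; it does not use the study's data, and the convention that a negative fold change denotes down-regulation is an assumption.

```python
# Hypothetical per-gene statistics: (gene, fold_change, p_value).
# Negative fold change denotes down-regulation; all values are invented.
gene_stats = [
    ("GENE_A", 1.8, 0.001),
    ("GENE_B", 1.1, 0.200),   # fails the fold-change threshold
    ("GENE_C", -1.5, 0.004),  # down-regulated but still passes
    ("GENE_D", 2.3, 0.050),   # fails the P-value threshold
]

def differentially_expressed(stats, fc_min=1.2, p_max=0.01):
    """Apply the |fold change| >= fc_min and P < p_max filter."""
    return [g for g, fc, p in stats if abs(fc) >= fc_min and p < p_max]

print(differentially_expressed(gene_stats))  # ['GENE_A', 'GENE_C']
```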